

 backward propagation



80f2f15983422987ea30d77bb531be86-Paper.pdf

Neural Information Processing Systems

We then separate the optimization process into two steps, corresponding to the weight update and the structure-parameter update. For the former step, we use the conventional chain rule, whose computation can be made sparse by exploiting the sparse structure.
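The sparse weight-update step can be illustrated with a minimal NumPy sketch: the chain-rule gradient is a dense outer product, but masking it with the sparsity structure means only the existing weights are ever updated. This is an illustrative sketch under assumed shapes, not the paper's implementation; the mask density and layer sizes are made up.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical sparse linear layer: y = W @ x with a fixed sparsity mask on W.
mask = rng.random((4, 8)) < 0.25           # structure: which weights exist
W = rng.standard_normal((4, 8)) * mask
x = rng.standard_normal(8)

y = W @ x
grad_y = np.ones(4)                        # upstream gradient (illustrative)

# Weight-update step via the conventional chain rule; the dense outer
# product is masked so only the existing (sparse) weights receive gradient.
grad_W = np.outer(grad_y, x) * mask

assert np.all(grad_W[~mask] == 0)          # pruned positions get no update
W -= 0.01 * grad_W                         # SGD step on the sparse weights
```

In practice the masked outer product would be computed directly in a sparse format rather than densely and then masked, which is where the cost saving comes from.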



DropBP: Accelerating Fine-Tuning of Large Language Models by Dropping Backward Propagation

Neural Information Processing Systems

Large language models (LLMs) have achieved significant success across various domains. However, training these LLMs typically involves substantial memory and computational costs during both forward and backward propagation. While parameter-efficient fine-tuning (PEFT) considerably reduces the training memory associated with parameters, it does not address the significant computational costs and activation memory. In this paper, we propose Dropping Backward Propagation (DropBP), a novel approach designed to reduce computational costs and activation memory while maintaining accuracy. DropBP randomly drops layers during backward propagation, which is essentially equivalent to training shallow submodules generated by undropped layers and residual connections. Additionally, DropBP calculates the sensitivity of each layer to assign an appropriate drop rate, thereby stabilizing the training process. DropBP is not only applicable to full fine-tuning but can also be orthogonally integrated with all types of PEFT by dropping layers during backward propagation. Specifically, DropBP can reduce training time by 44% with comparable accuracy to the baseline, accelerate convergence to the same perplexity by 1.5$\times$, and enable training with a sequence length 6.2$\times$ larger on a single NVIDIA-A100 GPU.
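The layer-dropping idea behind DropBP can be sketched in NumPy: the forward pass runs every residual block, while the backward pass skips a block's gradient with its assigned drop rate, letting the residual connection carry the gradient straight through. This is an illustrative sketch, not the paper's implementation; the drop rates here are made-up values standing in for DropBP's sensitivity-based rates.

```python
import numpy as np

rng = np.random.default_rng(0)

def block_forward(x, W):
    return x + np.tanh(W @ x)              # residual block: x + f(x)

def block_backward(grad_out, x, W, drop):
    """Backward through one residual block; optionally drop the f-branch."""
    if drop:
        # DropBP-style: skip the expensive branch gradient entirely; the
        # residual connection still passes the gradient straight through.
        return grad_out, np.zeros_like(W)
    h = np.tanh(W @ x)
    local = (1 - h**2)[:, None] * W        # Jacobian of tanh(Wx) w.r.t. x
    grad_W = ((1 - h**2) * grad_out)[:, None] * x[None, :]
    return grad_out + local.T @ grad_out, grad_W

# Toy stack of residual blocks with per-layer drop rates (hypothetical values
# standing in for the sensitivity-derived rates described in the abstract).
Ws = [rng.standard_normal((3, 3)) * 0.1 for _ in range(4)]
drop_rates = [0.1, 0.3, 0.5, 0.7]

x = rng.standard_normal(3)
acts = [x]
for W in Ws:
    acts.append(block_forward(acts[-1], W))

grad = np.ones(3)                          # upstream gradient
for W, a, p in zip(reversed(Ws), reversed(acts[:-1]), reversed(drop_rates)):
    grad, grad_W = block_backward(grad, a, W, drop=rng.random() < p)
```

Dropping a block in the backward pass is what makes training equivalent to a shallow submodule of the surviving layers plus residual connections, as the abstract describes.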


Rethinking the Backward Propagation for Adversarial Transferability

Neural Information Processing Systems

Transfer-based attacks generate adversarial examples on the surrogate model, which can mislead other black-box models without access, making it promising to attack real-world applications. Recently, several works have been proposed to boost adversarial transferability, in which the surrogate model is usually overlooked. In this work, we identify that non-linear layers (e.g., ReLU, max-pooling, etc.) truncate the gradient during backward propagation, making the gradient w.r.t.
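The truncation effect described above is easy to see for ReLU: its derivative is zero wherever the pre-activation is negative, so the upstream gradient is cut off at those positions. The sketch below contrasts the standard ReLU backward with a linearized backward that lets the gradient pass through untruncated (in the spirit of methods that soften the non-linearity's derivative; this is an illustration, not the paper's specific method).

```python
import numpy as np

x = np.array([-2.0, -0.5, 0.3, 1.5])       # pre-activations
grad_out = np.ones_like(x)                  # upstream gradient

# Standard ReLU backward: gradient is truncated to zero wherever x < 0.
grad_relu = grad_out * (x > 0)

# One way to avoid the truncation: propagate the gradient through the
# non-linearity as if it were linear (illustrative, not the paper's method).
grad_linear = grad_out.copy()

print(grad_relu)    # [0. 0. 1. 1.]
print(grad_linear)  # [1. 1. 1. 1.]
```

For transfer attacks, a less-truncated gradient on the surrogate tends to be a better proxy for the gradients of unseen black-box models.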


VeriLoRA: Fine-Tuning Large Language Models with Verifiable Security via Zero-Knowledge Proofs

Liao, Guofu, Wang, Taotao, Zhang, Shengli, Zhang, Jiqun, Long, Shi, Tao, Dacheng

arXiv.org Artificial Intelligence

Fine-tuning large language models (LLMs) is crucial for adapting them to specific tasks, yet it remains computationally demanding and raises concerns about correctness and privacy, particularly in untrusted environments. Although parameter-efficient methods like Low-Rank Adaptation (LoRA) significantly reduce resource requirements, ensuring the security and verifiability of fine-tuning under zero-knowledge constraints remains an unresolved challenge. To address this, we introduce VeriLoRA, the first framework to integrate LoRA fine-tuning with zero-knowledge proofs (ZKPs), achieving provable security and correctness. VeriLoRA employs advanced cryptographic techniques -- such as lookup arguments, sumcheck protocols, and polynomial commitments -- to verify both arithmetic and non-arithmetic operations in Transformer-based architectures. The framework provides end-to-end verifiability for forward propagation, backward propagation, and parameter updates during LoRA fine-tuning, while safeguarding the privacy of model parameters and training data. Leveraging GPU-based implementations, VeriLoRA demonstrates practicality and efficiency through experimental validation on open-source LLMs like LLaMA, scaling up to 13 billion parameters. By combining parameter-efficient fine-tuning with ZKPs, VeriLoRA bridges a critical gap, enabling secure and trustworthy deployment of LLMs in sensitive or untrusted environments.
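The LoRA computation that VeriLoRA's proofs cover is itself simple: the frozen weight is perturbed by a scaled low-rank product. A minimal NumPy sketch of one LoRA forward step (dimensions, rank, and scaling are assumed for illustration; the zero-knowledge machinery is of course not shown):

```python
import numpy as np

rng = np.random.default_rng(0)

d, k, r = 6, 4, 2                # hypothetical dims; r is the LoRA rank
W = rng.standard_normal((d, k))  # frozen pretrained weight
A = rng.standard_normal((r, k)) * 0.01
B = np.zeros((d, r))             # standard LoRA init: B = 0
x = rng.standard_normal(k)
alpha = 8.0                      # LoRA scaling factor (assumed value)

# LoRA forward: the low-rank update B @ A is added to the frozen weight.
y = W @ x + (alpha / r) * (B @ A) @ x

# With B initialized to zero, the adapted model matches the base model.
assert np.allclose(y, W @ x)
```

Only `A` and `B` are trained, which is what keeps both the fine-tuning and, in VeriLoRA's setting, the statements being proven small relative to the full model.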




Gradient Estimation Methods of Approximate Multipliers for High-Accuracy Retraining of Deep Learning Models

Meng, Chang, Burleson, Wayne, De Micheli, Giovanni

arXiv.org Artificial Intelligence

Approximate multipliers (AppMults) are widely used in deep learning accelerators to reduce their area, delay, and power consumption. However, AppMults introduce arithmetic errors into deep learning models, necessitating a retraining process to recover accuracy. A key step in retraining is computing the gradient of the AppMult, i.e., the partial derivative of the approximate product with respect to each input operand. Existing approaches typically estimate this gradient using that of the accurate multiplier (AccMult), which can lead to suboptimal retraining results. To address this, we propose two methods to obtain more precise gradients of AppMults. The first, called LUT-2D, characterizes the AppMult gradient with 2-dimensional lookup tables (LUTs), providing fine-grained estimation and achieving the highest retraining accuracy. The second, called LUT-1D, is a compact and more efficient variant that stores gradient values in 1-dimensional LUTs, achieving comparable retraining accuracy with shorter runtime. Experimental results show that on CIFAR-10 with convolutional neural networks, our LUT-2D and LUT-1D methods improve retraining accuracy by 3.83% and 3.72% on average, respectively. On ImageNet with vision transformer models, our LUT-1D method improves retraining accuracy by 23.69% on average, compared to a state-of-the-art retraining framework.

Modern artificial intelligence (AI) technologies excel in a wide range of areas such as natural language processing and computer vision. However, this rapid growth raises serious concerns about power consumption [1]. To achieve energy-efficient deep learning accelerators, researchers have adopted an emerging design paradigm called approximate computing, which reduces power consumption at the cost of errors [2], [3]. Approximate computing is particularly suitable for deep learning accelerators, since they are inherently resilient to errors and noise.
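The LUT idea can be sketched concretely: tabulate a finite-difference gradient of the approximate product over the operand grid (a 2-D table indexed by both operands), and compress it to a 1-D table by averaging over one operand. The approximate multiplier below (truncating the low product bits) and the table construction are illustrative assumptions, not the paper's specific AppMult or LUT-building procedure.

```python
import numpy as np

# Hypothetical 4-bit approximate multiplier: drop the two low product bits.
def appmult(a, b):
    return (a * b) & ~0x3          # truncation-based approximation

vals = np.arange(16)

# LUT-2D sketch: tabulate the partial derivative of the approximate product
# w.r.t. operand a via a finite difference over the operand grid.
lut2d = np.zeros((16, 16))
for a in vals[:-1]:
    for b in vals:
        lut2d[a, b] = appmult(a + 1, b) - appmult(a, b)
lut2d[15, :] = lut2d[14, :]        # extend the table at the boundary

# LUT-1D sketch: average over the other operand to get a compact table.
lut1d = lut2d.mean(axis=0)

# During retraining, the backward pass looks up these gradients instead of
# using the accurate-multiplier derivative d(a*b)/da = b.
a, b = 5, 9
grad_exact = b                     # AccMult gradient w.r.t. a
grad_lut = lut2d[a, b]             # LUT-2D gradient estimate
```

The forward pass still uses the approximate product; only the backward estimate changes, which is why a better gradient table can recover accuracy without touching the hardware.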